General Punctuation

General Punctuation is a Unicode block containing punctuation, spacing, and formatting characters for use with all scripts and writing systems. Included are the defined-width spaces, joining formats, directional formats, smart quotes, archaic and novel punctuation such as the interrobang, and invisible mathematical operators. Additional punctuation characters are in the Supplemental Punctuation block and sprinkled in dozens of other Unicode blocks.
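
As a rough illustration of what membership in the block means in practice, the following Python sketch tests whether characters fall inside it and prints their Unicode names. The range U+2000 through U+206F is taken from the Unicode block data rather than from the text above, and the helper names `in_general_punctuation` and `describe` are made up for this example.

```python
import unicodedata

# Assumption: the block spans U+2000..U+206F (the range is not stated in the
# article text; it comes from the Unicode block data).
GENERAL_PUNCTUATION = range(0x2000, 0x2070)


def in_general_punctuation(ch: str) -> bool:
    """Return True if the single character ch lies in the assumed block range."""
    return ord(ch) in GENERAL_PUNCTUATION


def describe(text: str) -> list[tuple[str, str]]:
    """List (code point, Unicode name) for each block member found in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
        if in_general_punctuation(ch)
    ]


if __name__ == "__main__":
    # Smart quotes, an interrobang and an ellipsis, written as escapes.
    sample = "\u201cReally\u203d\u201d she asked\u2026"
    for code_point, name in describe(sample):
        print(code_point, name)
```

Plain ASCII punctuation such as `?` or `"` is not reported, since those characters live in the Basic Latin block.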


Block

Several characters in this block are usually not rendered with a directly visible glyph. Ten whitespace characters U+2002 through U+200B (fixed ''en'' or ''1/2 em'', ''em'', ''1/3 em'', ''1/4 em'' and ''1/6 em'', ''figure'' and ''punctuation space'', variable ''thin'' or ''1/5 em'' and ''hair space'', fixed ''zero-width space'') and U+205F (''math medium'' or ''4/18 em space'') differ by horizontal width, while U+2000 and U+2001 (''en'' and ''em quad'') are effectively aliases of U+2002 and U+2003, respectively. Another two prohibit line breaks: U+202F is a variant of U+2009 or U+2004, and U+2060 (ill-termed ''word joiner'') is a variant of U+200B. Three zero-width characters U+200B through U+200D (''space'', ''non-joiner'' and ''joiner'') differ in how they affect ligation and shaping of adjacent letters, such as contextual forms in Arabic. Eleven invisible characters, U+200E and U+200F (''left-to-right'' and ''right-to-left mark''), U+202A through U+202E (''embeds'', ''pops'' and ''overrides'') and U+2066 through U+2069 (''isolates''), control the directionality of text unless higher-level markup overrides them. There are explicit ''line'' and ''paragraph separators'' at U+2028 and U+2029.
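
Because these characters leave no visible mark, it can help to make them explicit when inspecting text. The sketch below groups the code points listed in the paragraph above and replaces each one with a visible tag. The group labels and the names `GROUPS`, `classify` and `reveal` are illustrative choices for this example, not Unicode property values or part of any library.

```python
from typing import Optional

# Code points transcribed from the paragraph above. The labels are
# illustrative names for this sketch, not official Unicode properties.
GROUPS = {
    "bidi-control": [0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A)],
    "line-or-paragraph-separator": [0x2028, 0x2029],
    "no-break": [0x202F, 0x2060],
    "zero-width": [0x200B, 0x200C, 0x200D],
    "space": [*range(0x2000, 0x200B), 0x205F],
}

# Invert the table into a code point -> label lookup.
LABEL = {cp: label for label, cps in GROUPS.items() for cp in cps}


def classify(ch: str) -> Optional[str]:
    """Return the illustrative group label for ch, or None if it is not listed."""
    return LABEL.get(ord(ch))


def reveal(text: str) -> str:
    """Replace each listed invisible character with a visible <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(ch):04X}>" if ord(ch) in LABEL else ch for ch in text
    )


if __name__ == "__main__":
    # A zero-width space and a right-to-left override hidden in plain text.
    print(reveal("a\u200bb\u202ec"))
    print(classify("\u200d"))
```

A check of this kind is sometimes used when reviewing source code or user input, since the directional overrides in particular can make displayed text differ from its stored order.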


Emoji

The General Punctuation block contains two emoji: U+203C (''double exclamation mark'') and U+2049 (''exclamation question mark''). The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation.
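
The four variants can be written out by appending a variation selector to each base character. The following is a minimal sketch; the function name `variation_sequences` and the dictionary keys are invented for this example, and whether a given sequence actually renders differently depends on the fonts and platform in use.

```python
TEXT_SELECTOR = "\ufe0e"   # VS15: request text presentation
EMOJI_SELECTOR = "\ufe0f"  # VS16: request emoji-style presentation

# The two emoji base characters named in the paragraph above.
EMOJI_BASES = ("\u203c", "\u2049")


def variation_sequences() -> dict[str, str]:
    """Build the four base + selector sequences, keyed by a readable label."""
    sequences = {}
    for base in EMOJI_BASES:
        code_point = f"U+{ord(base):04X}"
        sequences[f"{code_point} text"] = base + TEXT_SELECTOR
        sequences[f"{code_point} emoji"] = base + EMOJI_SELECTOR
    return sequences


if __name__ == "__main__":
    for label, sequence in variation_sequences().items():
        # Show the code points so the invisible selector is visible in output.
        points = " ".join(f"U+{ord(ch):04X}" for ch in sequence)
        print(f"{label}: {sequence} ({points})")
```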


History

The following Unicode-related documents record the purpose and process of defining specific characters in the General Punctuation block:

